Transcript-level quantification was performed with lr-kallisto (v0.51.1), kallisto_optoff_k64 binary, using the GENCODE v48 human genome, transcriptome, and annotated GTF as references. Isoform abundance estimates (TxPM) and counts were imported into R where Differential Isoform Usage and splicing analyses were performed with IsoformSwitchAnalyzeR (v2.4.0).
The design matrix incorporated disease status (AD vs. CTRL) as the primary factor while adjusting for sex as a covariate. Filtering was performed with default parameters (alpha < 0.05 and absolute isoform fraction difference (dIF) > 0.1) to retain genes with expression ≥1 TPM and isoforms with non-zero expression.
Differential Isoform Usage was also assessed with significance thresholds of alpha < 0.05 and absolute dIF > 0.1. 26 total significant isoform switches were further characterized by integrating gene-level and isoform-level differential expression results from DESeq2 (v1.44.0). Functional consequences were predicted through four external computational tools: SignalP 6.0 for signal peptides, DeepTMHMM for protein topology, DeepLoc2.0 for subcellular localization, and Pfam for protein domains. We further evaluated the 26 significant isoform switches (isoform switch q value < 0.05) for switching consequences, localization shifts, and alternative splicing following the IsoformSwitchAnalyzeR vignette with default parameters.
All analyses were conducted in R (v4.4.0) and tidyverse (v2.0.0) ecosystem packages. Plots were generated with IsoformSwitchAnalyzeR or ggplot2 (v3.5.2).
## Loading required package: ggplot2
## Loading required package: limma
## Loading required package: DEXSeq
## Loading required package: BiocParallel
## Loading required package: Biobase
## Loading required package: BiocGenerics
##
## Attaching package: 'BiocGenerics'
## The following object is masked from 'package:limma':
##
## plotMA
## The following objects are masked from 'package:stats':
##
## IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
##
## anyDuplicated, aperm, append, as.data.frame, basename, cbind,
## colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
## get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
## match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
## Position, rank, rbind, Reduce, rownames, sapply, setdiff, table,
## tapply, union, unique, unsplit, which.max, which.min
## Welcome to Bioconductor
##
## Vignettes contain introductory material; view with
## 'browseVignettes()'. To cite Bioconductor, see
## 'citation("Biobase")', and for packages 'citation("pkgname")'.
## Loading required package: SummarizedExperiment
## Loading required package: MatrixGenerics
## Loading required package: matrixStats
##
## Attaching package: 'matrixStats'
## The following objects are masked from 'package:Biobase':
##
## anyMissing, rowMedians
##
## Attaching package: 'MatrixGenerics'
## The following objects are masked from 'package:matrixStats':
##
## colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
## colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
## colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
## colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
## colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
## colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
## colWeightedMeans, colWeightedMedians, colWeightedSds,
## colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
## rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
## rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
## rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
## rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
## rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
## rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
## rowWeightedSds, rowWeightedVars
## The following object is masked from 'package:Biobase':
##
## rowMedians
## Loading required package: GenomicRanges
## Loading required package: stats4
## Loading required package: S4Vectors
##
## Attaching package: 'S4Vectors'
## The following object is masked from 'package:utils':
##
## findMatches
## The following objects are masked from 'package:base':
##
## expand.grid, I, unname
## Loading required package: IRanges
## Loading required package: GenomeInfoDb
## Loading required package: DESeq2
## Loading required package: AnnotationDbi
## Loading required package: RColorBrewer
## Loading required package: satuRn
## Loading required package: sva
## Loading required package: mgcv
## Loading required package: nlme
##
## Attaching package: 'nlme'
## The following object is masked from 'package:IRanges':
##
## collapse
## This is mgcv 1.9-1. For overview type 'help("mgcv-package")'.
## Loading required package: genefilter
##
## Attaching package: 'genefilter'
## The following objects are masked from 'package:MatrixGenerics':
##
## rowSds, rowVars
## The following objects are masked from 'package:matrixStats':
##
## rowSds, rowVars
## Loading required package: pfamAnalyzeR
## Loading required package: readr
##
## Attaching package: 'readr'
## The following object is masked from 'package:genefilter':
##
## spec
## Loading required package: stringr
## Loading required package: dplyr
##
## Attaching package: 'dplyr'
## The following object is masked from 'package:nlme':
##
## collapse
## The following object is masked from 'package:AnnotationDbi':
##
## select
## The following objects are masked from 'package:GenomicRanges':
##
## intersect, setdiff, union
## The following object is masked from 'package:GenomeInfoDb':
##
## intersect
## The following objects are masked from 'package:IRanges':
##
## collapse, desc, intersect, setdiff, slice, union
## The following objects are masked from 'package:S4Vectors':
##
## first, intersect, rename, setdiff, setequal, union
## The following object is masked from 'package:matrixStats':
##
## count
## The following object is masked from 'package:Biobase':
##
## combine
## The following objects are masked from 'package:BiocGenerics':
##
## combine, intersect, setdiff, union
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats 1.0.0 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.1.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ lubridate::%within%() masks IRanges::%within%()
## ✖ dplyr::collapse() masks nlme::collapse(), IRanges::collapse()
## ✖ dplyr::combine() masks Biobase::combine(), BiocGenerics::combine()
## ✖ dplyr::count() masks matrixStats::count()
## ✖ dplyr::desc() masks IRanges::desc()
## ✖ tidyr::expand() masks S4Vectors::expand()
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::first() masks S4Vectors::first()
## ✖ dplyr::lag() masks stats::lag()
## ✖ BiocGenerics::Position() masks ggplot2::Position(), base::Position()
## ✖ purrr::reduce() masks GenomicRanges::reduce(), IRanges::reduce()
## ✖ dplyr::rename() masks S4Vectors::rename()
## ✖ lubridate::second() masks S4Vectors::second()
## ✖ lubridate::second<-() masks S4Vectors::second<-()
## ✖ dplyr::select() masks AnnotationDbi::select()
## ✖ dplyr::slice() masks IRanges::slice()
## ✖ readr::spec() masks genefilter::spec()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
knitr::opts_chunk$set(echo=TRUE, warning=FALSE, message=FALSE, fig.width=12, fig.height=8, fig.align = "center")## R version 4.4.0 (2024-04-24)
## Platform: x86_64-pc-linux-gnu
## Running under: AlmaLinux 8.10 (Cerulean Leopard)
##
## Matrix products: default
## BLAS/LAPACK: FlexiBLAS NETLIB; LAPACK version 3.11.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] lubridate_1.9.3 forcats_1.0.0
## [3] purrr_1.1.0 tidyr_1.3.1
## [5] tibble_3.2.1 tidyverse_2.0.0
## [7] IsoformSwitchAnalyzeR_2.4.0 pfamAnalyzeR_1.4.0
## [9] dplyr_1.1.4 stringr_1.5.1
## [11] readr_2.1.5 sva_3.52.0
## [13] genefilter_1.86.0 mgcv_1.9-1
## [15] nlme_3.1-164 satuRn_1.12.0
## [17] DEXSeq_1.50.0 RColorBrewer_1.1-3
## [19] AnnotationDbi_1.66.0 DESeq2_1.44.0
## [21] SummarizedExperiment_1.34.0 GenomicRanges_1.56.0
## [23] GenomeInfoDb_1.40.0 IRanges_2.38.0
## [25] S4Vectors_0.42.1 MatrixGenerics_1.16.0
## [27] matrixStats_1.5.0 Biobase_2.64.0
## [29] BiocGenerics_0.50.0 BiocParallel_1.38.0
## [31] limma_3.60.6 ggrepel_0.9.6
## [33] ggplot2_3.5.2
##
## loaded via a namespace (and not attached):
## [1] rstudioapi_0.16.0 jsonlite_2.0.0 tximport_1.32.0
## [4] magrittr_2.0.3 GenomicFeatures_1.56.0 farver_2.1.2
## [7] rmarkdown_2.29 BiocIO_1.14.0 zlibbioc_1.50.0
## [10] vctrs_0.6.5 locfdr_1.1-8 memoise_2.0.1
## [13] Rsamtools_2.20.0 RCurl_1.98-1.14 htmltools_0.5.8.1
## [16] S4Arrays_1.4.0 progress_1.2.3 AnnotationHub_3.12.0
## [19] lambda.r_1.2.4 curl_6.4.0 SparseArray_1.4.0
## [22] sass_0.4.10 bslib_0.9.0 plyr_1.8.9
## [25] httr2_1.0.1 futile.options_1.0.1 cachem_1.1.0
## [28] GenomicAlignments_1.40.0 lifecycle_1.0.4 pkgconfig_2.0.3
## [31] Matrix_1.7-0 R6_2.6.1 fastmap_1.2.0
## [34] GenomeInfoDbData_1.2.12 digest_0.6.37 colorspace_2.1-1
## [37] tximeta_1.22.1 geneplotter_1.82.0 RSQLite_2.3.6
## [40] hwriter_1.3.2.1 filelock_1.0.3 timechange_0.3.0
## [43] httr_1.4.7 abind_1.4-8 compiler_4.4.0
## [46] bit64_4.0.5 withr_3.0.2 DBI_1.2.2
## [49] biomaRt_2.60.1 rappdirs_0.3.3 DelayedArray_0.30.1
## [52] rjson_0.2.21 tools_4.4.0 glue_1.8.0
## [55] VennDiagram_1.7.3 restfulr_0.0.15 grid_4.4.0
## [58] reshape2_1.4.4 generics_0.1.4 gtable_0.3.6
## [61] BSgenome_1.72.0 tzdb_0.4.0 ensembldb_2.28.0
## [64] hms_1.1.3 xml2_1.3.6 XVector_0.44.0
## [67] BiocVersion_3.19.1 pillar_1.11.0 splines_4.4.0
## [70] BiocFileCache_2.12.0 lattice_0.22-6 survival_3.6-4
## [73] rtracklayer_1.64.0 bit_4.0.5 annotate_1.82.0
## [76] tidyselect_1.2.1 locfit_1.5-9.9 Biostrings_2.72.0
## [79] pbapply_1.7-2 knitr_1.50 gridExtra_2.3
## [82] ProtGenerics_1.36.0 edgeR_4.2.0 futile.logger_1.4.3
## [85] xfun_0.52 statmod_1.5.0 stringi_1.8.3
## [88] UCSC.utils_1.0.0 lazyeval_0.2.2 yaml_2.3.10
## [91] boot_1.3-30 evaluate_1.0.4 codetools_0.2-20
## [94] BiocManager_1.30.26 cli_3.6.5 xtable_1.8-4
## [97] jquerylib_0.1.4 dichromat_2.0-0.1 Rcpp_1.1.0
## [100] dbplyr_2.5.0 png_0.1-8 XML_3.99-0.16.1
## [103] parallel_4.4.0 blob_1.2.4 prettyunits_1.2.0
## [106] AnnotationFilter_1.28.0 bitops_1.0-9 pwalign_1.0.0
## [109] txdbmaker_1.0.1 scales_1.4.0 crayon_1.5.3
## [112] rlang_1.1.6 formatR_1.14 KEGGREST_1.44.0
# import counts
kallisto_counts_path <- "../kallisto_counts/"
kallisto_quant <- importIsoformExpression(
parentDir = kallisto_counts_path,
addIsofomIdAsColumn = TRUE
)
new_col_names <- c("isoform_id","bc01_CTRL", "bc02_CTRL", "bc03_CTRL", "bc04_CTRL", "bc09_AD", "bc10_AD", "bc11_AD", "bc12_AD")
names(kallisto_quant$abundance) <- new_col_names
names(kallisto_quant$counts) <- new_col_names
names(kallisto_quant$length) <- new_col_names
kallisto_quant$counts# generate design matrix
sampleID <- colnames(kallisto_quant$abundance[-1])
condition <- gsub('.*_', '', colnames(kallisto_quant$abundance[-1])) %>% as.factor()
design <- data.frame(
sampleID = sampleID,
condition = condition
)
# add sex and ageand race as covariates
design$sex <- factor(c('male','male','male','female','female','female','male','female'))
#design$race <- factor(c('Black','Black', 'White', 'White', 'White','White','Black','Black'))
designUsed GENCODE v48 .gtf file and transcripts .fa file
# create switchAnalyzeRlist object
gtf_file <- "../../refs/gencode.v48.annotation.gtf"
transcriptome_fasta <- "../../refs/gencode.v48.transcripts.fa.gz"
switch_analyzer_list <- importRdata(
isoformCountMatrix = kallisto_quant$counts,
isoformRepExpression = kallisto_quant$abundance,
designMatrix = design,
isoformExonAnnoation = gtf_file,
isoformNtFasta = transcriptome_fasta,
fixStringTieAnnotationProblem = TRUE,
showProgress = TRUE
)## | | | 0% | |=================================== | 50% | |======================================================================| 100%
## | | | 0% | |======================================================================| 100%
## comparison estimated_genes_with_dtu
## 1 AD vs CTRL 0 - 0
## This switchAnalyzeRlist list contains:
## 152394 isoforms from 39993 genes
## 1 comparison from 2 conditions (in total 8 samples)
##
## Feature analyzed:
## [1] "ORFs, ntSequence"
FDR < 0.05
dIFcutoff = 0.1
geneExpressionCutoff =1 FPKM/TPM/RPKM
isoformExpressionCutoff = 0 RPKM/FPKM (removes completely unused isoforms)
# filter:
# Multi-isoform genes
# Gene expression
# Isoform expression
# Isoform Fraction (isoform usage)
# Unwanted isoform classes
# Unwanted gene biotypes
# Genes without differential isoform usage
# default cutoffs:
# alpha=0.05 --> FDR
# dIFcutoff = 0.1 --> changes in absolute isoform usage / analogous to log2FC cut-off
# geneExpressionCutoff --> Default is 1 FPKM/TPM/RPKM
# isoformExpressionCutoff --> Default is 0 (which removes completely unused isoforms); also in RPKM/FPKM
switch_analyzer_list_filtered <- preFilter(
switchAnalyzeRlist = switch_analyzer_list,
geneExpressionCutoff = 1,
isoformExpressionCutoff = 0,
removeSingleIsoformGenes = TRUE
)FDR < 0.05
dIFcutoff = 0.1
reduceToSwitchingGenes = TRUE (only uses genes w at least 1 significant differentially used isoform)
# switch analysis with DEXseq
# default cutoffs:
# alpha=0.05 --> FDR
# dIFcutoff = 0.1 --> changes in absolute isoform usage / analogous to log2FC cut-off
# reduceToSwitchingGenes --> only uses genes w at least 1 sig differentially used isoform
switch_analyzer_list_filtered <- isoformSwitchTestDEXSeq(
switchAnalyzeRlist = switch_analyzer_list_filtered,
reduceToSwitchingGenes=TRUE
)
extractSwitchSummary(switch_analyzer_list_filtered)# extract switch amino acid and nt sequences to files
switch_analyzer_list_filtered <- extractSequence(
switch_analyzer_list_filtered,
pathToOutput = "results_kallisto_isoformanalyzer",
writeToFile=TRUE
)Arranged by isoform_switch_q_value
PTC = pre-mature termination codon
# PTC means --> pre-mature termination codons
switch_analyzer_list_df <- switch_analyzer_list_filtered$isoformFeatures %>%
as.data.frame() %>%
arrange(isoform_switch_q_value)
switch_analyzer_list_dfAdd padj values
# add differential gene and isoform expression to switchanalyzer object
diff_gene_expr <- read.csv("../differential_expression/kallisto_DGE.csv")
diff_iso_expr <- read.csv("../differential_expression/kallisto_differential_isoform_expression.csv")
switch_analyzer_list_df <- switch_analyzer_list_df %>%
left_join(
diff_gene_expr %>% dplyr::select(gene_name, padj),
by = join_by(gene_id == gene_name)) %>%
mutate(gene_q_value = padj) %>%
subset(select = -padj) %>%
left_join(
diff_iso_expr %>% dplyr::select(isoform_id, padj),
by = "isoform_id") %>%
mutate(iso_q_value = padj) %>%
subset(select = -padj)
switch_analyzer_list_filtered$isoformFeatures$gene_q_value <- switch_analyzer_list_df$gene_q_value
switch_analyzer_list_filtered$isoformFeatures$iso_q_value <- switch_analyzer_list_df$iso_q_value
# save switch_analyzer_list_filtered as .RDS file for future use
#saveRDS(switch_analyzer_list_filtered, file = "switch_analyzer_list_filtered_sex.rds")# significant isoform switches (iso_switch_q_value < 0.05)
switch_analyzer_list_filtered <- readRDS("./switch_analyzer_list_filtered_sex.rds")
switch_analyzer_list_df <- switch_analyzer_list_filtered$isoformFeatures %>% as.data.frame()
significant_iso_switches <- switch_analyzer_list_df %>%
dplyr::filter(isoform_switch_q_value < 0.05) %>%
arrange(isoform_switch_q_value) %>%
dplyr::select(isoform_id, gene_id, condition_1, condition_2,iso_biotype, IF1, IF2, dIF, isoform_switch_q_value, gene_switch_q_value, PTC)reference is AD
isoform_switch_q_value < 0.05
dIF < -0.1 or > 0.1
# volcano-like plot of isoform switches AD vs CTRL
ggplot(data = switch_analyzer_list_df, aes(x = dIF, y = -log10(isoform_switch_q_value))) +
geom_point(
aes(color = abs(dIF) > 0.1 & isoform_switch_q_value < 0.05),
size = 1
) +
geom_hline(yintercept = -log10(0.05), linetype = 'dashed') +
geom_vline(xintercept = c(-0.1, 0.1), linetype = 'dashed') +
scale_color_manual('Significant\nIsoform Switch',
values = c('FALSE' = 'blue', 'TRUE' = 'orange')) +
labs(x = 'dIF', y = '-Log10 (Isoform Switch Q Value)') +
ggtitle("Significant Isoform Switches AD vs. CTRL (Q value < 0.05)") +
ggrepel::geom_text_repel(
aes(label = ifelse(abs(dIF) > 0.1 & isoform_switch_q_value < 0.05,
paste0(gene_id, " (", isoform_id, ")"),
'')),
size = 2.5
) +
theme_bw()signal peptides, protein topology, subcellular locations, and protein domains
# SignalP - Prediction of Signal Peptides
# https://services.healthtech.dtu.dk/services/SignalP-6.0/
switch_analyzer_list_filtered <- analyzeSignalP(
switchAnalyzeRlist = switch_analyzer_list_filtered,
pathToSignalPresultFile = "./biolib_results/signalp/prediction_results.txt"
)
# analyzeDeepTMHMM : Prediction of protein topology — the proteins location compared to the cell membrane. Can be intracellular, transmembrane (TM) or extracellular.
# https://dtu.biolib.com/results/0797323f-c818-4343-88d7-43009062fc59/?token=ka2OLAD9XhWPRFIO
switch_analyzer_list_filtered <- analyzeDeepTMHMM(
switchAnalyzeRlist = switch_analyzer_list_filtered,
pathToDeepTMHMMresultFile = "./biolib_results/deeptmhmm/TMRs.gff3",
showProgress=TRUE
)## | | | 0% | |= | 1% | |== | 2% | |== | 3% | |=== | 4% | |==== | 5% | |===== | 6% | |===== | 8% | |====== | 9% | |======= | 10% | |======== | 11% | |======== | 12% | |========= | 13% | |========== | 14% | |=========== | 15% | |=========== | 16% | |============ | 17% | |============= | 18% | |============== | 19% | |============== | 20% | |=============== | 22% | |================ | 23% | |================= | 24% | |================= | 25% | |================== | 26% | |=================== | 27% | |==================== | 28% | |==================== | 29% | |===================== | 30% | |====================== | 31% | |======================= | 32% | |======================= | 33% | |======================== | 34% | |========================= | 35% | |========================== | 37% | |========================== | 38% | |=========================== | 39% | |============================ | 40% | |============================= | 41% | |============================= | 42% | |============================== | 43% | |=============================== | 44% | |================================ | 45% | |================================ | 46% | |================================= | 47% | |================================== | 48% | |=================================== | 49% | |=================================== | 51% | |==================================== | 52% | |===================================== | 53% | |====================================== | 54% | |====================================== | 55% | |======================================= | 56% | |======================================== | 57% | |========================================= | 58% | |========================================= | 59% | |========================================== | 60% | |=========================================== | 61% | |============================================ | 62% | |============================================ | 63% | |============================================= | 65% | |============================================== | 66% | |=============================================== | 67% | |=============================================== | 68% | |================================================ | 69% | |================================================= | 70% | |================================================== | 71% | |================================================== | 72% | |=================================================== | 73% | |==================================================== | 74% | |===================================================== | 75% | |===================================================== | 76% | |====================================================== | 77% | |======================================================= | 78% | |======================================================== | 80% | |======================================================== | 81% | |========================================================= | 82% | |========================================================== | 83% | |=========================================================== | 84% | |=========================================================== | 85% | |============================================================ | 86% | |============================================================= | 87% | |============================================================== | 88% | |============================================================== | 89% | |=============================================================== | 90% | |================================================================ | 91% | |================================================================= | 92% | |================================================================= | 94% | |================================================================== | 95% | |=================================================================== | 96% | |==================================================================== | 97% | |==================================================================== | 98% | |===================================================================== | 99% | |======================================================================| 100%
# analyzeDeepLoc2 - prediction of sub-cellular localization(s) of protein
# deeploc2.0
# https://services.healthtech.dtu.dk/services/DeepLoc-2.0/
switch_analyzer_list_filtered <- analyzeDeepLoc2(
switchAnalyzeRlist = switch_analyzer_list_filtered,
pathToDeepLoc2resultFile = "./biolib_results/deeploc2/results_6894F7A60000BCEA626E8931.csv",
quiet = FALSE
)
# pfam
# see /bin/isoform_switch_analyzer/biolib_results/pfam_hmmscan/ directory for shell script
switch_analyzer_list_filtered <- analyzePFAM(
switchAnalyzeRlist = switch_analyzer_list_filtered,
pathToPFAMresultFile = "./biolib_results/pfam_hmmscan/Pfam_result.txt",
showProgress=TRUE
)## | | | 0% | |= | 2% | |=== | 4% | |==== | 6% | |===== | 7% | |====== | 9% | |======== | 11% | |========= | 13% | |========== | 15% | |============ | 17% | |============= | 19% | |============== | 20% | |================ | 22% | |================= | 24% | |================== | 26% | |=================== | 28% | |===================== | 30% | |====================== | 31% | |======================= | 33% | |========================= | 35% | |========================== | 37% | |=========================== | 39% | |============================= | 41% | |============================== | 43% | |=============================== | 44% | |================================ | 46% | |================================== | 48% | |=================================== | 50% | |==================================== | 52% | |====================================== | 54% | |======================================= | 56% | |======================================== | 57% | |========================================= | 59% | |=========================================== | 61% | |============================================ | 63% | |============================================= | 65% | |=============================================== | 67% | |================================================ | 69% | |================================================= | 70% | |=================================================== | 72% | |==================================================== | 74% | |===================================================== | 76% | |====================================================== | 78% | |======================================================== | 80% | |========================================================= | 81% | |========================================================== | 83% | |============================================================ | 85% | |============================================================= | 87% | |============================================================== | 89% | |================================================================ | 91% | |================================================================= | 93% | |================================================================== | 94% | |=================================================================== | 96% | |===================================================================== | 98% | |======================================================================| 100%
switch_analyzer_list_filtered$signalPeptideAnalysis %>% dplyr::left_join(switch_analyzer_list_filtered$isoformFeatures %>% dplyr::select(isoform_id, gene_name), by = join_by(isoform_id)) %>% dplyr::select(isoform_id, gene_name, everything())switch_analyzer_list_filtered$topologyAnalysis %>% dplyr::left_join(switch_analyzer_list_filtered$isoformFeatures %>% dplyr::select(isoform_id, gene_name), by = join_by(isoform_id)) %>% dplyr::select(isoform_id, gene_name, everything())switch_analyzer_list_filtered$subCellLocationAnalysis %>% dplyr::left_join(switch_analyzer_list_filtered$isoformFeatures %>% dplyr::select(isoform_id, gene_name), by = join_by(isoform_id)) %>% dplyr::select(isoform_id, gene_name, everything())Sorted by E_value (significance)
switch_analyzer_list_filtered$domainAnalysis %>% dplyr::left_join(switch_analyzer_list_filtered$isoformFeatures %>% dplyr::select(isoform_id, gene_name), by = join_by(isoform_id)) %>% dplyr::select(isoform_id, gene_name, everything()) %>% arrange(E_value)# isoform switch visualization
top_switches <- extractTopSwitches(
switch_analyzer_list_filtered,
filterForConsequences = FALSE,
n = NA, # n=NA: all features are returned
extractGenes = FALSE, # when FALSE isoforms are returned
sortByQvals = TRUE,
)
plot_isoform_switch <- function(isoform_switch_list, gene) { switchPlot( isoform_switch_list, gene=gene, condition1 = 'AD', condition2 = 'CTRL', plotTopology = TRUE ) }
for (gene in top_switches$gene_id) { plot_isoform_switch(switch_analyzer_list_filtered, gene) }Reference is AD
# alternative splicing analysis
switch_analyzer_list_filtered <- analyzeAlternativeSplicing(
switchAnalyzeRlist = switch_analyzer_list_filtered,
quiet=TRUE
)
switch_analyzer_list_filtered$AlternativeSplicingAnalysis# switch consequences
consequencesOfInterest <- c(
'intron_retention',
'intron_structure',
'ORF_genomic',
'ORF_length',
'NMD_status',
'signal_peptide_identified',
'last_exon',
'exon_number',
'tss',
'tts',
'isoform_seq_similarity',
'isoform_length',
"sub_cell_location",
"sub_cell_shift_to_cell_membrane",
"sub_cell_shift_to_cytoplasm",
"sub_cell_shift_to_nucleus",
"sub_cell_shift_to_Extracellular",
"isoform_topology",
"extracellular_region_count",
"intracellular_region_count",
"extracellular_region_length",
"intracellular_region_length",
'5_utr_seq_similarity',
'5_utr_length',
'3_utr_seq_similarity',
'3_utr_length',
'domains_identified',
'domain_isotype',
'domain_length',
'genomic_domain_position'
)
switch_analyzer_list_filtered <- analyzeSwitchConsequences(
switch_analyzer_list_filtered,
consequencesToAnalyze = consequencesOfInterest,
dIFcutoff = 0.1,
showProgress=TRUE
)## | | | 0% | |=== | 4% | |===== | 8% | |======== | 12% | |=========== | 15% | |============= | 19% | |================ | 23% | |=================== | 27% | |====================== | 31% | |======================== | 35% | |=========================== | 38% | |============================== | 42% | |================================ | 46% | |=================================== | 50% | |====================================== | 54% | |======================================== | 58% | |=========================================== | 62% | |============================================== | 65% | |================================================ | 69% | |=================================================== | 73% | |====================================================== | 77% | |========================================================= | 81% | |=========================================================== | 85% | |============================================================== | 88% | |================================================================= | 92% | |=================================================================== | 96% | |======================================================================| 100%
### Extract top switching genes (by q-value)
top_switches_q_value <- extractTopSwitches(
switch_analyzer_list_filtered,
filterForConsequences = TRUE,
sortByQvals = TRUE
)
top_switches_q_value### Extract top switching genes (by dIF values)
top_switches_dif_value <- extractTopSwitches(
switch_analyzer_list_filtered,
filterForConsequences = TRUE,
sortByQvals = FALSE
)
top_switches_dif_valueReference is AD
extractConsequenceEnrichment(
switch_analyzer_list_filtered,
consequencesToAnalyze="all",
analysisOppositeConsequence = TRUE,
localTheme = theme_bw(base_size = 14),
returnResult = FALSE # if TRUE returns a data.frame with the summary statistics
)